We discuss the geometry of trees endowed with a causal structure using
the conventional framework of equilibrium statistical mechanics. We show
how this ensemble is related to popular growing network models. In
particular we demonstrate that on a class of affine attachment kernels the
two models are identical, but they can differ substantially for other
choices of weights. We show that causal trees exhibit condensation even for
asymptotically linear kernels. We derive general formulae describing the
degree distribution, the ancestor-descendant correlation and the
probability that a randomly chosen node lives at a given geodesic distance
from the root. It is shown that the Hausdorff dimension $d_H$ of the causal
networks is generically infinite.

PACS numbers: 02.50.Cw, 05.40.-a, 05.50.+q, 87.18.Sn

1. Introduction 

The study of networks is becoming increasingly popular (for a recent
review see e.g. [1] and also [2]). The main reason for this is the emergence
of a great wealth of data on the Internet, the WWW, science citation
networks, cell metabolism networks and so on. Most of those networks (if
not all) exhibit features that are not explained by the classical theory of
random graphs due to Erdős and Rényi. Perhaps the most prominent among
those features is the power-like degree distribution. The degree of a vertex
is the number of links connected to it. While the classical theory predicts
a Poissonian distribution for the degree of a vertex in a random graph, in
many naturally occurring networks this distribution was found to have
power-like tails. One way to understand this is based on an ancient
observation "For unto every one that hath shall be given, and he shall have
abundance: but from him that hath not shall be taken away even that which
he hath." [4]: a popular web page is more likely to attract more links to
it, a frequently cited paper is more likely to get more citations and so
on. In a more modern context this principle was formulated in [5] and
adapted to the description of networks in [6] and [7]. This is a diachronic
approach concentrating on the description of growing networks. This is very
natural, as most of the networks we encounter are a result of some growth
process. The simple models studied in the literature try to capture the
essential features of the growth mechanism.

One can look at networks in a different way that is also quite natural,
as this is the approach taken in statistical mechanics and probability
theory. In this synchronic view we treat each network as a single element of
a statistical ensemble [8]. The ensemble is defined by specifying the "phase
space", that is the class of graphs belonging to it, and the weight (or
probability) of each graph in the ensemble. The probabilities can be
assigned ad hoc or, what is more interesting, derived from other principles.
In particular it is clear that each growing network model also defines a
statistical ensemble. The ensemble consists of all the graphs that can be
constructed by the specified growth process, and the probability assigned to
each graph is the probability of constructing the given graph. Thus the
growth mechanism implicitly defines the probability for each graph. We find
it worthwhile to study what kind of ensembles can be obtained from the
growing network models [9]. The motivation for this is twofold. First, using
another "toolbox" one can obtain more insight into the original models.
Indeed we are in a position to make some general statements about the
correlation functions in growing random network (GN) models. Secondly, while
natural networks are, as stated, usually grown, often we may not have access
to the growth history and we are effectively left with the statistical
ensemble approach.

In this contribution, which is mostly based on [9], we will describe a
statistical ensemble that incorporates the causal structure inherent in the
GN models. We will show how it relates to the GN models and derive some
results on correlation functions.

2. Causal trees 

2.1. Definition 

First we review very quickly the growing random network model [7, 10].
In the simplest version we start with a single vertex and then at each
stage we attach a new vertex to one of the already existing ones. The
probability of attaching the new vertex $v_{n+1}$ to some old one $v_i$
depends only on the degree $n_i$ of the node $v_i$ and is proportional to
$A_{n_i}$, which is called the attachment kernel.

It is clear that this process produces a rooted, labeled tree, each node
being labeled by the time at which it was inserted into the network and
the first node being the root. It is also clear that not all the labelings
are possible: the label of the "father" must be smaller than the label of
its child. We will call trees that satisfy this condition causal.
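The growth rule and the causality condition can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the simulations reported here: the names `grow_tree`, `is_causal` and the `kernel` callback are ours, nodes are labeled from 0 rather than 1, and we adopt the planted convention (the root starts with degree 1 because of its stem), so that kernels vanishing at degree 0 cause no trouble.

```python
import random

def grow_tree(n_nodes, kernel, seed=None):
    """Grow a tree by sequential attachment.

    Nodes are labeled 0, 1, 2, ... by insertion time.
    Returns (parent, degree): parent[i] is the node that i was attached to
    (None for the root) and degree[i] is the final degree of node i.
    Assumption: planted convention, the root's stem counts as one link.
    """
    rng = random.Random(seed)
    parent = [None]
    degree = [1]
    for new in range(1, n_nodes):
        # attachment probability proportional to kernel(degree of old node)
        weights = [kernel(d) for d in degree]
        target = rng.choices(range(new), weights=weights)[0]
        parent.append(target)
        degree[target] += 1
        degree.append(1)
    return parent, degree

def is_causal(parent):
    """True if every node carries a larger label than its parent."""
    return all(p < child for child, p in enumerate(parent) if p is not None)

# Example: affine kernel A_n = n + omega with omega = 0.5 (illustrative value)
parent, degree = grow_tree(200, kernel=lambda n: n + 0.5, seed=1)
print(is_causal(parent), sum(degree))
```

Every tree produced this way is causal by construction; `is_causal` is useful mainly for checking labelings obtained by other means.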

In order to define the weights we must calculate the probability of
constructing a given tree $T$. To obtain a node with degree $n$ we must have
attached a new node to a node with degree $n - 1$. The probability of this
happening at the time $t$ is

$$ p_t(n-1) = \frac{A_{n-1}}{\sum_{i \in T(t-1)} A_{n_i}}, \tag{1} $$

where $T(t-1)$ is the tree at the partial stage of the construction, just
before attaching the new node $t$. Unfortunately the normalizing factor in
the denominator in general depends on the exact structure of the tree.
In consequence the overall probability of building a tree will depend on
the way it was constructed, in particular on the labeling. This problem is
exemplified in figure 1. This situation, apart from being impossible to
work with, is quite unnatural. What we would like to have is a weight that
depends only on the node degrees and factorises:

$$ \rho(T) = \rho(n_1, \ldots, n_N) = \prod_{i=1}^{N} q_{n_i}. \tag{2} $$

It turns out that there exists a class of GN models that is compatible with
the above requirement. Those are the models with affine attachment kernels,
i.e. of the form

$$ A_n = n + \omega, \qquad \omega > -1, \tag{3} $$

where $\omega$ is a constant. For such kernels the normalization factor
depends only on the size of the tree:

$$ \sum_{i \in T} A_{n_i} = \sum_{i} n_i + N\omega = 2N - 2 + N\omega. \tag{4} $$



For this class of attachment kernels the choice

$$ q_n = \prod_{k=1}^{n-1} A_k \tag{5} $$

leads to a model identical with the original GN model. For other kernels we
will still define our model by the formulas (2) and (5). In this situation
we can only expect some form of asymptotic or qualitative agreement between
the models, if any. In fact, as we will show later, the two models can
differ significantly even for very simple non-affine kernels.

Fig. 1. Two ways of constructing the same (unlabeled) tree. The labels below
each tree show the probability of obtaining this tree from the preceding one.
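The equivalence for affine kernels, and its failure otherwise, can be checked exactly on small trees. The sketch below is our own illustration, with hypothetical function names. It assumes the planted convention (every node, including the root with its stem, starts with degree 1), so the normalization for an affine kernel is $2N - 1 + N\omega$ rather than the value in (4). It compares the probability of growing each labeled tree with the normalized weight $\prod_i q_{n_i}$.

```python
from fractions import Fraction
from itertools import product

def weight(n, kernel):
    """q_n = A_1 A_2 ... A_{n-1}, with q_1 = 1 (empty product)."""
    q = Fraction(1)
    for k in range(1, n):
        q *= kernel(k)
    return q

def growth_probability(parents, kernel):
    """Probability that the growth process builds the labeled tree `parents`.

    parents[j-1] is the parent of node j; node 0 is the planted root.
    """
    degree = [1]                      # the root's stem counts as one link
    prob = Fraction(1)
    for p in parents:
        norm = sum(kernel(d) for d in degree)
        prob *= Fraction(kernel(degree[p])) / norm
        degree[p] += 1
        degree.append(1)
    return prob

def ensemble_probabilities(n_nodes, kernel):
    """Normalized weights prod_i q_{n_i} for all causally labeled trees."""
    rho = {}
    for parents in product(*[range(j) for j in range(1, n_nodes)]):
        degree = [1] * n_nodes
        for p in parents:
            degree[p] += 1
        w = Fraction(1)
        for d in degree:
            w *= weight(d, kernel)
        rho[parents] = w
    total = sum(rho.values())
    return {t: w / total for t, w in rho.items()}

def models_agree(n_nodes, kernel):
    probs = ensemble_probabilities(n_nodes, kernel)
    return all(growth_probability(t, kernel) == p for t, p in probs.items())

affine = lambda n: n + Fraction(1, 2)     # omega = 1/2, illustrative value
quadratic = lambda n: n * n               # a simple non-affine kernel
print(models_agree(5, affine), models_agree(4, quadratic))
```

Exact rational arithmetic is used so that agreement means strict equality rather than agreement within rounding errors.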



2.2. Recursion relation 

Most of the properties of the ensemble can be derived from the canonical
partition function

$$ Z_N = \frac{1}{N!} \sum_{T} L(T)\, \rho(T), \tag{6} $$

where we sum over all the unlabeled trees with $N$ nodes and $L(T)$ is the
number of distinct causal labelings of the given tree $T$.

We start by deriving a recursion relation for $L(T)$. It is convenient at
this stage to change to the planted ensemble. This amounts to attaching
to the root vertex an additional link: a stem. This does not change in
any way the properties of the ensemble in the large $N$ limit but makes the
calculations easier. From $k$ planted trees $T_1, \ldots, T_k$ we can
construct a new tree $T = T_1 \oplus \cdots \oplus T_k$ by joining together
the stems at a new node (see figure 2). The number of causal labelings of
the resulting tree is

$$ L(T_1 \oplus \cdots \oplus T_k) = \frac{1}{k!} \frac{N!}{N_1! \cdots N_k!}\, L(T_1) \cdots L(T_k), \tag{7} $$

with $N_1 + \cdots + N_k = N$. One has to give $N + 1$ labels to the nodes of
the compound tree. However, the smallest label must be attached to the root.
The remaining $N$ labels are arbitrarily distributed among the trees. This
is the origin of the multinomial factor. Permuting the trees $T_i$ does not
change the compound tree. This explains the presence of the factor $1/k!$.

Fig. 2. Operation $T_1, \ldots, T_k \to T_1 \oplus \cdots \oplus T_k$.

Fig. 3. Recursion relation for $Z_{N+1}$.
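The counting argument can be tested by brute force. In the sketch below (our own illustration, hypothetical names) an unlabeled rooted tree is a nested tuple of its subtrees. For one specific tree only permutations of identical subtrees leave it unchanged, so we divide by the factorials of their multiplicities; the factor $1/k!$ has the same effect once one sums over ordered $k$-tuples of subtrees, as is done in the partition function.

```python
from collections import Counter
from itertools import product
from math import factorial

def canonical(tree):
    """Canonical form of an unlabeled rooted tree (nested tuples of children)."""
    return tuple(sorted(canonical(child) for child in tree))

def size(tree):
    return 1 + sum(size(child) for child in tree)

def causal_labelings(tree):
    """Number of distinct causal labelings of an unlabeled rooted tree."""
    tree = canonical(tree)
    numerator = factorial(size(tree) - 1)       # labels below the root
    denominator = 1
    for child in tree:
        numerator *= causal_labelings(child)
        denominator *= factorial(size(child))
    for multiplicity in Counter(tree).values():
        denominator *= factorial(multiplicity)  # identical subtrees can be swapped
    return numerator // denominator

def shapes_by_enumeration(n_nodes):
    """Count causally labeled trees per unlabeled shape by brute force."""
    counts = Counter()
    for parents in product(*[range(j) for j in range(1, n_nodes)]):
        children = [[] for _ in range(n_nodes)]
        for node, p in enumerate(parents, start=1):
            children[p].append(node)

        def build(v):
            return tuple(sorted(build(c) for c in children[v]))

        counts[build(0)] += 1
    return counts

def chain(n):
    """A single branch of n nodes."""
    tree = ()
    for _ in range(n - 1):
        tree = (tree,)
    return tree

cherry = ((), ())
binary7 = (cherry, cherry)        # complete binary tree with 7 nodes
print(causal_labelings(chain(7)), causal_labelings(binary7))
```

A single long branch admits exactly one causal labeling, while the bushier binary tree of the same size admits several.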

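Because of the property (2) the weights of the new tree obviously
factorise as

$$ \rho(T_1 \oplus \cdots \oplus T_k) = q_{k+1}\, \rho(T_1) \cdots \rho(T_k). \tag{8} $$

The partition function $Z_{N+1}$ can be constructed by summing over the trees
of size smaller or equal to $N$ (see figure 3):

$$ Z_{N+1} = \frac{1}{(N+1)!} \sum_{k=1}^{\infty} \sum_{T_1, \ldots, T_k} \rho(T_1 \oplus \cdots \oplus T_k)\, L(T_1 \oplus \cdots \oplus T_k), \tag{9} $$

where the trees $T_1, \ldots, T_k$ have sizes $N_1 + \cdots + N_k = N$.
Inserting (7) and (8) and rearranging the terms we arrive at

$$ Z_{N+1} = \frac{1}{N+1} \sum_{k=1}^{\infty} \frac{q_{k+1}}{k!} \sum_{N_1, \ldots, N_k} \delta_{N_1 + \cdots + N_k,\, N} \prod_{i=1}^{k} Z_{N_i}. \tag{10} $$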
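The recursion can be cross-checked against a direct enumeration for small sizes. The sketch below is an illustration with names of our choosing. It assumes the recursion in the form $(N+1) Z_{N+1} = \sum_k \frac{q_{k+1}}{k!} \sum_{N_1 + \cdots + N_k = N} \prod_i Z_{N_i}$ with $Z_1 = q_1$, and takes $Z_N$ to be $1/N!$ times the sum of the weights $\prod_i q_{n_i}$ over all causally labeled planted trees.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def partition_functions(q, n_max):
    """Return [Z_1, ..., Z_{n_max}] computed from the recursion."""
    z = [Fraction(0), Fraction(q(1))]        # z[N] = Z_N; z[0] is a placeholder
    for n in range(1, n_max):
        # conv[m]: sum over k-tuples (N_1..N_k) with total m of prod Z_{N_i}
        conv = [Fraction(0)] * (n + 1)
        conv[0] = Fraction(1)                # k = 0, empty product
        total = Fraction(0)
        for k in range(1, n + 1):
            new = [Fraction(0)] * (n + 1)
            for a in range(n + 1):
                if conv[a] == 0:
                    continue
                for b in range(1, n - a + 1):
                    new[a + b] += conv[a] * z[b]
            conv = new
            total += Fraction(q(k + 1), factorial(k)) * conv[n]
        z.append(total / (n + 1))
    return z[1:]

def brute_force_z(q, n):
    """Z_n from an explicit sum over all causally labeled planted trees."""
    total = Fraction(0)
    for parents in product(*[range(j) for j in range(1, n)]):
        degree = [1] * n                     # one link "upwards" per node
        for p in parents:
            degree[p] += 1
        w = Fraction(1)
        for d in degree:
            w *= q(d)
        total += w
    return total / factorial(n)

linear_weights = lambda n: factorial(n - 1)  # q_n for the kernel A_k = k
print(partition_functions(linear_weights, 6))
```

The convolution is built up one factor at a time, so the cost stays polynomial in $N$, whereas the brute-force sum grows like $(N-1)!$ and serves only as a check.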



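Adding the term $Z_1 = q_1$ and summing both sides of equation (10),
multiplied by $e^{-\mu(N+1)}$, over $N$ we get

$$ \sum_{N} N Z_N e^{-\mu N} = e^{-\mu} \sum_{k=0}^{\infty} \frac{q_{k+1}}{k!}\, Z(\mu)^k, \tag{11} $$

where

$$ Z(\mu) = \sum_{N} Z_N e^{-\mu N} \tag{12} $$

is the grand-canonical partition function. Finally

$$ Z'(\mu) = -e^{-\mu} F(Z), \tag{13} $$

where

$$ F(Z) = \sum_{k=1}^{\infty} \frac{q_k}{(k-1)!}\, Z^{k-1}. \tag{14} $$

Equation (13) can be integrated to give

$$ e^{-\mu} = G(Z) \equiv \int_0^{Z} \frac{dx}{F(x)}. \tag{15} $$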

The function $G(Z)$ is a positive, monotonically growing function of $Z$,
bounded from above (one can ignore the trivial case where all $q_n$ except
$q_1$ and $q_2$ are zero). Hence $\mu$ is bounded from below: $Z(\mu)$ has a
singularity at some $\mu = \bar\mu$. Denote by $\bar x$ the radius of
convergence of the series $F(Z)$. The critical value of $\mu$ is given by

$$ \bar\mu = -\log G(\bar x). \tag{16} $$

This formula holds also when the radius of convergence $\bar x$ is infinite,
since all terms in the series (14) are positive and the integral in (15) is
convergent in all cases of interest: $G(\infty) < \infty$. Please note that
$\bar\mu$ is a "free energy" density:

$$ \bar\mu = \lim_{N \to \infty} \frac{1}{N} \ln Z_N. \tag{17} $$
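As a worked illustration (ours, not taken from the original derivation), consider the linear kernel $A_k = k$, i.e. $\omega = 0$ in (3). We assume here that the series and the integral take the forms $F(Z) = \sum_{k \ge 1} q_k Z^{k-1}/(k-1)!$ and $e^{-\mu} = G(Z) = \int_0^Z dx / F(x)$. Everything can then be done in closed form:

```latex
\begin{align*}
q_n &= \prod_{k=1}^{n-1} k = (n-1)!, \qquad
F(Z) = \sum_{k \ge 1} Z^{k-1} = \frac{1}{1-Z}, \qquad \bar x = 1, \\
G(Z) &= \int_0^Z (1-x)\, dx = Z - \tfrac{1}{2} Z^2, \qquad
G(\bar x) = \tfrac{1}{2}, \qquad \bar\mu = -\log \tfrac{1}{2} = \log 2, \\
Z(\mu) &= 1 - \sqrt{1 - 2 e^{-\mu}}
        = e^{-\mu} + \tfrac{1}{2} e^{-2\mu} + \tfrac{1}{2} e^{-3\mu} + \tfrac{5}{8} e^{-4\mu} + \dots
\end{align*}
```

The square-root singularity sits at $e^{-\mu} = 1/2$, that is at $\mu = \log 2$, where $Z$ reaches $\bar x = 1$. The expansion coefficients are the canonical partition functions $Z_1 = 1$, $Z_2 = 1/2$, $Z_3 = 1/2$, $Z_4 = 5/8$ of this example.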

2.3. Degree distribution 
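The vertex degree distribution is calculated using

$$ \pi_n = q_n \frac{\partial \bar\mu}{\partial q_n}, \tag{18} $$

which gives

$$ \pi_n = \frac{1}{G(\bar x)} \frac{q_n}{(n-1)!} \int_0^{\bar x} dx\, \frac{x^{n-1}}{F(x)^2}. \tag{19} $$

Again, this formula is also valid when $\bar x = \infty$.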
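The degree distribution formula is easy to evaluate numerically. The sketch below is our illustration, not the original computation. It assumes the formula in the form $\pi_n = \frac{1}{G(\bar x)} \frac{q_n}{(n-1)!} \int_0^{\bar x} dx\, x^{n-1}/F(x)^2$ and uses the linear kernel $A_k = k$ as an example, for which $q_n = (n-1)!$, $F(x) = 1/(1-x)$ and $\bar x = 1$. The result is compared with $4/[n(n+1)(n+2)]$, the well-known degree distribution of linear preferential attachment.

```python
from math import factorial, log

def simpson(f, a, b, intervals=2000):
    """Composite Simpson rule on [a, b] with an even number of intervals."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Assumed example: linear kernel A_k = k, hence q_n = (n-1)!,
# F(x) = 1/(1-x) and radius of convergence x_bar = 1.
X_BAR = 1.0

def q(n):
    return factorial(n - 1)

def inverse_F(x):
    return 1.0 - x            # 1/F(x) stays finite on [0, x_bar]

G_BAR = simpson(inverse_F, 0.0, X_BAR)     # G(x_bar)
MU_BAR = -log(G_BAR)                       # critical value of mu

def degree_distribution(n):
    """pi_n = q_n / ((n-1)! G(x_bar)) * integral of x^(n-1) / F(x)^2."""
    integral = simpson(lambda x: x ** (n - 1) * inverse_F(x) ** 2, 0.0, X_BAR)
    return q(n) / factorial(n - 1) * integral / G_BAR

for n in range(1, 6):
    print(n, degree_distribution(n), 4 / (n * (n + 1) * (n + 2)))
```

Working with $1/F$ instead of $F$ avoids the divergence of $F$ at $x = \bar x$ in this example; for other kernels `q` and `inverse_F` would have to be replaced accordingly.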

Summing over $n$ and using the definitions of $F$ and $G$ one easily checks
that $\pi_n$ is normalized to unity, as it should be. One further finds

$$ \sum_n n\, \pi_n = 2 - c, \qquad c = \frac{\bar x}{G(\bar x)\, F(\bar x)}. \tag{20} $$

On a tree, the r.h.s. should equal 2. This is the case when $F(x)$ diverges at
$x = \bar x$ and hence $c = 0$. Otherwise one encounters a pathology (anomaly),
which looks similar to that appearing in some maximally random tree mod- 
els (and in the so-called balls- in-boxes model, see Om), where in the large 
N limit one misses singular node(s) contributing term(s) of the type 

$$ N^{-1}\, \delta(n - cN). \tag{21} $$

Such nonuniformly behaving terms disappear if one first takes the
$N \to \infty$ limit in (20). It will be shown later that the average
distance between nodes is finite when $F(\bar x) < \infty$. This means that
singular node(s), with unbounded connectivity, are indeed expected to show up.




2.4. Condensation

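In order to check if the described anomaly really signals the appearance
of a singular vertex we have studied causal trees with the weights

$$ q_n = \begin{cases} 1, & n < d, \\ (n-d)!, & n \ge d, \end{cases} \tag{22} $$

where $d$ is an integer greater than or equal to two, derived using (5) from
the delayed linear attachment kernel

$$ A_n = \begin{cases} 1, & n < d, \\ n - d + 1, & n \ge d. \end{cases} \tag{23} $$

In this case the coefficients of the power series (14) behave like

$$ \frac{q_k}{(k-1)!} = \frac{(k-d)!}{(k-1)!} \sim \frac{1}{k^{d-1}} \quad \text{for } k \to \infty. \tag{24} $$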
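The passage from the kernel to the weights, and the asymptotic behaviour of the series coefficients, can be verified directly. The sketch below is an illustration with names of our choosing. It assumes the delayed kernel in the form $A_n = 1$ for $n < d$ and $A_n = n - d + 1$ for $n \ge d$, with weights $q_n = \prod_{k=1}^{n-1} A_k$; the value $d = 3$ is picked only as an example.

```python
from math import factorial

def delayed_kernel(n, d):
    """A_n = 1 for n < d and n - d + 1 for n >= d."""
    return 1 if n < d else n - d + 1

def weight(n, d):
    """q_n = prod_{k=1}^{n-1} A_k for the delayed kernel."""
    q = 1
    for k in range(1, n):
        q *= delayed_kernel(k, d)
    return q

def series_coefficient(k, d):
    """Coefficient q_k/(k-1)! multiplying Z^(k-1) in F(Z)."""
    return weight(k, d) / factorial(k - 1)

d = 3   # illustrative choice
for k in (10, 100, 1000):
    # the rescaled coefficient should approach 1 as k grows
    print(k, series_coefficient(k, d) * k ** (d - 1))
```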



The term $c$ in (20) is obviously not zero and its value is found to be
$c \approx 0.584692$. In figure 4 we have plotted the result of simulations
of the model of causal trees with 1000 and 4000 nodes (circles and diamonds).
One can clearly see peaks corresponding to the singular node. The vertical
lines mark the positions of those peaks predicted from (21). The perfect
agreement confirms our statement that the "link deficiency" in (20) signals
the existence of a singular node.

It is interesting to ask what is the shape of the vertex degree distribution
for the growing random model which has exactly the same kernel (23). Of
course in this case the two models are not identical because the kernel is
not affine. The resulting degree distribution for this model is plotted in
figure 4 (triangles). The continuous line is the approximate solution
taken from [10]. The observed discrepancy is due to finite size effects. As
we can see, the distributions in causal trees and growing networks differ
greatly; in particular there is no singular vertex in the GN model for this
choice of kernel.

Fig. 4. Vertex degree distribution for causal trees (1000 nodes (circles),
4000 nodes (diamonds)) and the GN model (4000 nodes (triangles)).



2.5. Degree correlations 

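Now, we turn to the calculation of the ancestor-descendant degree
correlation. It is obvious that an ancestor plays the role of the root of the
subgraph involving all its descendants. One can read from (10) the degree
distribution of the root:

$$ Z_l(N) = \frac{1}{N} \frac{q_l}{(l-1)!} \sum_{N_1, \ldots, N_{l-1}} \delta_{N_1 + \cdots + N_{l-1},\, N-1} \prod_{i=1}^{l-1} Z_{N_i}. \tag{25} $$

Going over to the grand-canonical ensemble one finds

$$ Z_l'(\mu) = -e^{-\mu} \frac{q_l}{(l-1)!}\, Z^{l-1}(\mu), \tag{26} $$

which, taking (13) into account and after integration, yields

$$ Z_l(\mu) = \frac{q_l}{(l-1)!} \int_0^{Z(\mu)} dx\, \frac{x^{l-1}}{F(x)}. \tag{27} $$

Using similar arguments one writes the weight of graphs where the root has
the degree $l$ and its daughter the degree $k$ as

$$ Z_{kl}(N) = \frac{1}{N} \frac{q_l}{(l-2)!} \sum_{N_1, \ldots, N_{l-1}} \delta_{N_1 + \cdots + N_{l-1},\, N-1} \prod_{i=1}^{l-2} Z_{N_i}\; Z_k(N_{l-1}). \tag{28} $$

Hence

$$ Z_{kl}'(\mu) = -e^{-\mu} \frac{q_l}{(l-2)!}\, Z^{l-2}(\mu)\, Z_k(\mu). \tag{29} $$

Integrating the above equation one finally obtains

$$ Z_{kl}(\mu(Z)) = \frac{q_l}{(l-2)!} \frac{q_k}{(k-1)!} \int_0^{Z} dx_2\, \frac{x_2^{l-2}}{F(x_2)} \int_0^{x_2} dx_1\, \frac{x_1^{k-1}}{F(x_1)}, \tag{30} $$

which is the conditional probability, up to normalization, that a descendant
has the degree $k$ when the ancestor's degree is $l$. The normalization is
determined by summing over $k$ on the r.h.s. above, with the result
$(l-1) Z_l(\mu)$.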

2.6. Fractal dimension 

Repeating over and over the iteration process leading to eq. (30) one gets

$$ Z_{k_1 k_2 \ldots k_r}(\mu(Z)) = \frac{q_{k_1}}{(k_1-1)!} \prod_{i=2}^{r} \frac{q_{k_i}}{(k_i-2)!} \int_0^{Z} dx_r\, \frac{x_r^{k_r-2}}{F(x_r)} \int_0^{x_r} dx_{r-1}\, \frac{x_{r-1}^{k_{r-1}-2}}{F(x_{r-1})} \cdots \int_0^{x_2} dx_1\, \frac{x_1^{k_1-1}}{F(x_1)}. \tag{31} $$

Summing over node degrees $k_1, k_2, \ldots, k_r$ one obtains the weight of all
graphs with a point separated by $r$ links from the root, i.e. the two-point
correlation function $C(r, \mu)$ introduced in sect. 1.2:

$$ C(r, \mu(Z)) = \int_0^{Z} dx_r\, \frac{F'(x_r)}{F(x_r)} \int_0^{x_r} dx_{r-1}\, \frac{F'(x_{r-1})}{F(x_{r-1})} \cdots \int_0^{x_2} dx_1. \tag{32} $$

For finite $\bar x$, replacing the upper limit of integration over $x_1$ by
$\bar x$ and performing all integrations, one gets

$$ C(r, \mu(Z)) \le \bar x\, \frac{[\ln (F(Z)/q_1)]^{r-1}}{(r-1)!}. \tag{33} $$

Hence, the tail of $C(r, \mu)$ falls at least as fast as a Poissonian.
Consequently $\langle r \rangle_\mu$ grows at most like $\ln F(Z)$. Assuming
that $F(Z)$ has at most a power singularity at $Z = \bar x$ one concludes that

$$ \langle r \rangle_\mu < \mathrm{const}\, \ln \frac{1}{\delta\mu}, \tag{34} $$

where $\delta\mu = \mu - \bar\mu$, and therefore

$$ \langle r \rangle_N < \mathrm{const}\, \ln N, \tag{35} $$

since $\delta\mu$ scales like $N^{-1}$. The argument is rather heuristic, but
suggestive (see also the examples in section II F of ref. [9]). It appears
that generically the causal trees have the small-world property
$d_H = \infty$, contrary to the
maximum entropy trees whose generic fractal dimension is finite |5l ll3l[n| . 
This phenomenon is easy to understand intuitively: the causal structure
suppresses long branches. This can be seen by noting that along a branch
from the root to a leaf no label permutations are possible, hence a tree
with a few long branches admits far fewer causal labelings than a "short
fat" one.
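The logarithmic growth of distances can be observed in a quick numerical experiment. The sketch below is only an illustration under stated assumptions: it uses the linear kernel $A_n = n$, an affine kernel for which, according to the discussion above, GN growth samples the causal tree ensemble, and it adopts the planted convention for the root. The function name and the sampling trick (drawing a uniformly random link end) are our choices.

```python
import math
import random

def mean_depth_linear_kernel(n_nodes, seed=0):
    """Average distance from the root in a tree grown with the kernel A_n = n.

    Picking a uniformly random entry of `ends` selects a node with probability
    proportional to its degree, because every node appears once per link end
    (the root's stem included, i.e. the planted convention).
    """
    rng = random.Random(seed)
    ends = [0]          # the stem of the root
    depth = [0]
    for new in range(1, n_nodes):
        target = rng.choice(ends)
        depth.append(depth[target] + 1)
        ends.extend((target, new))
    return sum(depth) / n_nodes

for n in (1_000, 10_000, 100_000):
    print(n, round(mean_depth_linear_kernel(n), 2), round(math.log(n), 2))
```

Comparing the two printed columns for increasing sizes shows whether the mean depth stays within a constant multiple of $\ln N$, as the bound (35) suggests; a single run per size is of course only indicative.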



3. Summary 

We have studied a statistical ensemble of tree graphs endowed with a
causal structure. We have derived some general formulas describing the
degree distribution, the ancestor-descendant correlation, and the probability
that a node lives at a given geodesic distance from the root. Using these
last results, we have shown that our causal networks generically have the
small-world property, i.e. their Hausdorff dimension is infinite.

We have shown that our model coincides with the growing random model
for the class of affine attachment kernels. Outside this class, however, the
models can differ wildly. In particular we have demonstrated that while
condensation of links can be observed in causal trees, it is not seen in
their growing network analogue (i.e. for the same weights).